Abstract: The Electron Cyclotron Resonance (ECR) ion source is a critical device for producing
highly charged ion beams in various applications. Analyzing the charge-state distribution of the ion 
beams is essential, but the manual analysis is labor-intensive and prone to inaccuracies due to impurity 
ions. An automatic spectrum recognition system based on intelligent algorithms is proposed for
rapid and accurate charge-state analysis of ECR ion sources. The system employs an adaptive-window-length
Savitzky-Golay (SG) filtering algorithm, an improved automatic multiscale peak detection
(AMPD) algorithm, and a greedy matching algorithm based on the relative distance to accurately 
match different peaks in the spectra with the corresponding charge-state ion species. Additionally, a 
user-friendly operator interface was developed for ease of use. Extensive testing on the online ECR 
ion source platform demonstrates that the system achieves high accuracy, with an average root mean 
square error of less than 0.1 A for identifying charge-state spectra of ECR ion sources. Moreover, the
system reduces the standard deviation of the first-order derivative of the smoothed Ar signal to 82.1846
A. These results indicate that the designed system can identify ion beam spectra with mass
numbers up to and including that of Xe. The proposed automatic spectrum recognition system
represents a significant advancement in ECR ion source analysis, offering a rapid and accurate 
approach for charge-state analysis while enhancing supply efficiency. The exceptional performance 
and successful implementation of the proposed system on multiple ECR ion source platforms at 
IMPCAS highlight its potential for widespread adoption in ECR ion source research and applications. 
1 Introduction


The ECR (Electron Cyclotron Resonance) ion source, initially developed by Prof. Geller and colleagues at Grenoble
Laboratory in France [1, 2], has evolved into a highly efficient source of highly charged ions with a diverse
range of beam types. It is renowned for its exceptional stability and reproducibility and has found widespread
application in heavy ion acceleration facilities worldwide. For instance, the VENUS ion source at Lawrence Berkeley
National Laboratory (LBNL) provides beam current for the 88-inch cyclotron [3], the SC-ECR ion source at the
Institute of Physical and Chemical Research (RIKEN) in Japan supplies beam current for the Radioactive Isotope Beam
Factory (RIBF) [4], and the SECRAL ion source at the Institute of Modern Physics (IMP) in Lanzhou, China, serves as the
beam current provider for the Heavy Ion Research Facility in Lanzhou (HIRFL) [5]. The beam transmission lines of the ECR
ion source are illustrated in Fig. 1, where ions of varying charge-to-mass ratios (q / M ) exhibit distinct deflection 
trajectories in the dipole magnet, resulting in different peak positions in the spectrum. This property allows for 
separating and sorting ions based on their charge-to-mass ratio, a crucial aspect of the transport system in analyzing 
and manipulating ion beams in ECR ion source research and applications. The spectrogram of the beam reflects the 
energy distribution of the ions. Accurate and rapid identification of the charge-state of the injected ion beam is crucial
for optimizing the beam supply efficiency of the ion source. However, the traditional manual identification method,
relying solely on the experience of engineers, entails complex calculations and comparisons and is susceptible to 
misidentification or omission due to system background noise. Furthermore, manual spectrum recognition has 
limitations such as a high technical threshold, prolonged time consumption, and a low accuracy rate [6-9]. To address 
these challenges, this paper presents a novel automatic spectrum recognition system for beam charge-state 
distribution spectra, featuring three key algorithms: the Savitzky-Golay (SG) filtering algorithm with adaptive 
window length, the improved automatic multiscale peak detection (AMPD) algorithm, and the greedy matching 
algorithm based on relative distance. These intelligent algorithms collectively enable fast and accurate spectra 
recognition while reducing human involvement, lowering the technical threshold, and offering significant research
value.
Fig.1 | ECR Beam Transport System. In this system, the beam deflection was achieved using a dipole magnet.
The trajectory of the deflected beam varied based on the charge-to-mass ratio of the ions. The dipole magnet induced 
a magnetic field that caused the charged particles in the beam to experience a Lorentz force, resulting in a curved 
path. The curvature of the trajectory depends on the charge-to-mass ratio of the ions, with lighter ions exhibiting 
larger curvatures than heavier ions. 

Influenced by environmental background noise, the spectrogram signal of the beam is often contaminated with noise.
Several filtering algorithms have been proposed to mitigate this issue, including wavelet denoising [10], empirical
mode decomposition (EMD) [11], Savitzky-Golay (SG) filtering [12], Legendre filtering [13], kernel regression
[14], local extremum center [15], and signal sparsity-based denoising [16]. Among these, SG filtering has been
widely utilized in diverse fields, such as digital control systems [17], electrocardiogram denoising [18], and nuclear
electrical reaction calculations [19]. Because SG filtering preserves the signal shape and peak properties [20], it is
particularly appealing for the current study. It effectively suppresses noise and outliers while preserving the
underlying trend and periodicity of the signal.

The first crucial step in spectrogram identification is the accurate and comprehensive detection of spectral peaks. 
Peak detection in signal processing plays a pivotal role in obtaining reliable results. Various algorithms have been 
developed for peak detection, including traditional window-threshold methods [21], wavelet transforms [22], 
template techniques [23], hidden Markov models [24], and others. However, most of these peak-finding algorithms 
suffer from the challenge of setting hyperparameters before their application. In practice, it has been observed that 
many commonly used algorithms require a significant number of hyperparameters to be pre-set, leading to varying 
results in peak detection even within the same spectrum due to different combinations of window length and peak 
height threshold. Determining the optimal parameter combination is not universally applicable, making the tuning 
process laborious and challenging. An alternative approach that addresses these limitations is the automatic
multiscale peak detection (AMPD) algorithm, which sets multiple window scales for peak detection based on the
input signal characteristics. This method does not rely on prior knowledge or require frequent tuning, making it more 
suitable than other algorithms for detecting peaks in a spectrum. 

The identification of beam spectrograms can be achieved by matching the identified peaks with the beam charge 
states [25]. The performance of the matching algorithm directly affects the accuracy of spectrogram identification 
results. A commonly used approach is the greedy matching algorithm, which determines the locally optimal solution
for each subproblem and combines the results to obtain a final global solution [26]. The essential advantage of this
algorithm lies in its high efficiency and ability to quickly obtain a global solution through locally optimal solutions 
[27]. Greedy algorithms have been widely employed in various fields, such as task scheduling problems in 
optimization [28], minimum spanning tree problems in graph theory [29], and queuing problems in queuing theory 
[30]. Despite the effectiveness of greedy algorithms in obtaining locally optimal solutions, it is important to note 
that they may not always find globally optimal solutions due to their lack of backtracking. However, greedy 
algorithms generally yield satisfactory results and are commonly used in practical applications. 

This study presents a novel approach for denoising spectrograms by introducing an adaptive window length SG 
filtering algorithm. The proposed algorithm can effectively remove background noise from the original spectrogram. 
By automatically determining the optimal window length, our algorithm overcomes the limitations of existing 
methods that require manual hyperparameter tuning, making it more practical and user-friendly. In addition, an 
improved AMPD algorithm is developed for spectral peak calibration. One notable advantage of our algorithm is its 
ability to automatically eliminate false peaks without needing prior hyperparameter settings, improving peak 
detection accuracy and reliability in spectrograms. Furthermore, the limitation of the greedy algorithm in global 
optimization is addressed by proposing an interquartile range (IQR) anomaly detection mechanism [31] based on 
relative distance. This mechanism aims to identify and reject solutions in the solution set that satisfy the greedy 
strategy but are outliers in the global context. By mitigating the limitations of the greedy algorithm, the accuracy 
and robustness of our algorithm in identifying spectral peaks are enhanced. The accuracy of the final identification 
results is evaluated and verified in this paper using a discriminative empirical formula. The experiment results 
demonstrate the effectiveness of our proposed algorithms in achieving high-quality denoising, accurate peak 
detection, and precise charge identification in spectrograms. The automatic spectrum recognition system presents a 
promising approach to analyzing spectrograms in various applications. 


2 Architecture of Automatic Spectrum Recognition System 


A human-computer interface is built into this system, through which the operator can specify and send the relevant
parameters. After receiving the parameters from the front end, the back-end program first scans the spectrum
according to the start current, the stop current, and the scan step. Two critical parameters must be collected in the
scanning process: the dipole magnet current and the Faraday cup intensity. In this paper, the dipole magnet current
is acquired through a resistive shunt using an AMETEK SGA high-power programmable DC power supply [32] with
a current acquisition range of 5 A to 6000 A. The Faraday cup intensity is acquired using an Agilent 34410A
high-performance digital multimeter with 6½-digit resolution and a sampling frequency of 1 kHz [33]. The dipole magnet
current values are rounded to four decimal places in amperes (A), and the Faraday cup intensity values are rounded
to two decimal places in electrostatic microamperes (eμA) to ensure ease of calculation and legibility. The collected
dipole magnet currents and Faraday cup intensities are transmitted to the data processing module. In this module, the
signal is first smoothed and filtered using the Savitzky-Golay (SG) filtering algorithm with adaptive window length, after
which the smoothed signal is searched for peaks using an improved automatic multiscale peak detection (AMPD) algorithm.
A greedy matching algorithm based on relative distance then matches the peaks to the
ion charge states. Finally, the matching results are output visually after outlier rejection. The
implementation of each algorithm is described in detail in the following sections. The overall system architecture is
shown in Fig. 2. The final results are displayed on a Qt interface; Qt is a widely used graphical user interface (GUI)
toolkit for creating interactive software applications [34]. The interface presents the processed data
clearly, allowing the operator to monitor the data
processing results and to adjust and resend the parameters in time.
Fig.2 The overall architecture of the beam charge-state distribution spectrogram identification system. Once the
parameters specified by the user through the human-computer interface are acquired, the data is obtained in real time 
by reading and writing the process variable (PV) value. The acquired data, including the dipole magnet currents and 
Faraday cup intensities, is then processed using the data processing module. 

The beam intensity is measured using a high-precision multimeter, while the dipole magnet power supply is
responsible for collecting the current of the dipole magnet with an original hardware sampling rate of 1 kHz. Data
downsampling is necessary to ensure synchronized data and to accommodate the architecture of the ECR ion source
control system. After a dipole current write command is issued for each sweep step, the dipole magnet current and
Faraday cup intensity values are read back after a delay of 0.1 seconds. This interval allows the dipole magnet
sufficient time to respond to the control command. Consequently, the sampling rate is reduced to 10 Hz to ensure
that the dipole magnet current and Faraday cup intensity match [35]. This process is demonstrated in Fig. 3.
Fig.3 Data acquisition and synchronization process.
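
The write, wait, and read-back cycle described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the control code of the actual system: `write_dipole`, `read_dipole`, and `read_faraday_cup` are hypothetical callables standing in for the PV access layer, while the 0.1 s settling delay and the rounding to four and two decimal places are taken from the text.

```python
import time
from typing import Callable, List, Tuple


def scan_spectrum(
    start: float,
    stop: float,
    step: float,
    write_dipole: Callable[[float], None],    # hypothetical PV write
    read_dipole: Callable[[], float],         # hypothetical PV read (A)
    read_faraday_cup: Callable[[], float],    # hypothetical PV read (e-microamps)
    settle_s: float = 0.1,
) -> List[Tuple[float, float]]:
    """Sweep the dipole current and return synchronized (current, intensity) pairs."""
    if step <= 0:
        raise ValueError("step must be positive")
    samples = []
    n_steps = int(round((stop - start) / step))
    for i in range(n_steps + 1):
        write_dipole(start + i * step)        # issue the set-point for this step
        time.sleep(settle_s)                  # let the magnet respond (about 10 Hz overall)
        current = round(read_dipole(), 4)     # four decimal places, as in the text
        intensity = round(read_faraday_cup(), 2)  # two decimal places
        samples.append((current, intensity))
    return samples
```

Reading both values only after the delay is what keeps each current paired with the intensity measured at that same set-point.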


2.1 Design of SG filtering algorithm with adaptive window length 


According to the design characteristics of the SG filtering algorithm, a larger window length is associated with
improved noise rejection and reduced variance of the estimation error for a given filter order.
However, excessively large window lengths can result in distortion and bias in the filter output compared to the
actual signal. Therefore, finding the optimal parameters for the SG filtering algorithm involves balancing bias and
variance in the error estimation. The parameters of the SG algorithm with an adaptive window length include the
order n of the polynomial and the window length N=2M+1. Within each window, the cost function given by Eq. (1)
is minimized.
In Eq. (1), $a_k$ represents the $k$th polynomial coefficient. In this case, the output of SG filtering can be expressed as:

where $W_n(j)$ is the $j$th sample of the continuous function $W_n(z)$, and $W_n(z)$ is a polynomial that can be defined

Similarly, $q_n(z)$ is a shifted Chebyshev polynomial obtained from the $n$th forward difference $\Delta^n$. This instance is
shown in Eq. (6).

$$q_n(z) = n!\,\Delta^n\left[\binom{z}{n}\binom{z-N}{n}\right] \qquad (6)$$

An SG filter of order $n$ and window length $N$ is used to reconstruct the signal $f(t)$ contaminated with noise. When
the signal $f(t)$ is sufficiently smooth such that there are $n+2$ consecutive derivatives, the optimal window length
of the SG filter [36] can be approximated using Eq. (7) as follows:

(7)

where $\sigma^2$ is the variance, and $v_n$ can be considered a function of the signal correlation, where $v_n/\sigma^2$ can be
interpreted approximately as the signal-to-noise ratio (SNR); a lower SNR leads to a larger optimal window length.
In this algorithm, the parameters include the original input signal, denoted as $x = \{x_i\}$, the window length $N$,
and the filter order $n$; the filter output is defined as $y = \{y_i\}$. To calculate $v_n$, the original signal $x$ is
first filtered by SG. To reduce the computational workload, the initial window length is set to five and the filter
order to two. The output $y$ of the first SG filter is obtained, along with its first-order derivative. Subsequently,
SG smoothing is iteratively performed, with the smoothing window length denoted as $N_1 = 2\lfloor N_{opt}/2 \rfloor$. The
iterative application of the SG filtering algorithm is crucial in enhancing the estimation accuracy of the desired
parameter $v_n$. By repeatedly filtering the original signal with SG, the algorithm refines the estimate of $v_n$,
leading to a more precise and reliable result. The optimal window length $N_{opt}$ is determined through an iterative
process based on Eq. (7). The iterative evaluation of $N_{opt}$ continues until $N_1 = N_{opt}$ is achieved, indicating
the termination of the process. This adaptive
approach ensures that the SG filtering algorithm adjusts the window length based on the specific characteristics of
the input signal, resulting in an optimized window length that enhances the accuracy of the filtered signal.
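
The iteration described above can be sketched in Python with SciPy. This is a hedged illustration rather than the authors' implementation. Since Eq. (7) is not reproduced here, the helper `optimal_window` assumes a closed form often quoted for the optimal SG window, $N_{opt} = \left[\frac{2(n+2)((2n+3)!)^2}{((n+1)!)^2}\cdot\frac{\sigma^2}{v_n}\right]^{1/(2n+5)}$, with $\sigma^2$ estimated from the residual $x - y$ and $v_n$ from the mean squared $(n+2)$th finite difference of $y$. Both estimators, the function names, and the `+ 1` that keeps the window odd (which SG filtering requires) are assumptions of this sketch.

```python
import math
import numpy as np
from scipy.signal import savgol_filter


def optimal_window(x, y, order):
    """Assumed closed-form estimate of the optimal SG window length (see lead-in)."""
    sigma2 = np.var(x - y)                        # noise variance from the residual
    v_n = np.mean(np.diff(y, n=order + 2) ** 2)   # mean squared (n+2)-th difference
    if sigma2 == 0.0 or v_n == 0.0:
        return None                               # nothing left to estimate
    c = (2 * (order + 2) * math.factorial(2 * order + 3) ** 2
         / math.factorial(order + 1) ** 2)
    return (c * sigma2 / v_n) ** (1.0 / (2 * order + 5))


def adaptive_savgol(x, order=2, n_init=5, max_iter=20):
    """Iterate SG filtering until the window length stops changing."""
    x = np.asarray(x, dtype=float)
    n_max = len(x) if len(x) % 2 else len(x) - 1          # largest odd window that fits
    n_min = order + 1 if order % 2 == 0 else order + 2    # smallest odd window > order
    window = n_init                                       # N = 5, n = 2 as in the text
    for _ in range(max_iter):                             # cap guards against oscillation
        y = savgol_filter(x, window, order)
        n_opt = optimal_window(x, y, order)
        if n_opt is None:
            break
        n_opt = min(n_opt, n_max)
        # Nearest odd length; the "+ 1" is an assumption of this sketch.
        new_window = int(np.clip(2 * int(n_opt // 2) + 1, n_min, n_max))
        if new_window == window:                          # converged
            break
        window = new_window
    return savgol_filter(x, window, order), window
```

For a noise-dominated signal, this assumed formula tends to settle on a short window after one or two passes, so the sketch should be read as a structural outline of the loop rather than a tuned filter.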


2.2 Design of the improved AMPD algorithm 


The improved AMPD algorithm proposed in this study enhances the original AMPD algorithm by incorporating
constraints on the peak output. The original AMPD algorithm employs a moving window method to find the local
maxima of the signal, where the window length, denoted as $w_k$, can be expressed as $\{w_k = 2k \mid k = 1, 2, \ldots, L\}$, where
$L$ is a parameter associated with the range of the signal [37]. However, if $L$ is set excessively large, certain spectral
peaks may remain unidentified. On the other hand, choosing $L$ too small may result in the inclusion of "false
peaks" in the results.

To ensure the complete detection of all spectral peaks of interest, the parameter $L$ is set in this study to
$L = \lceil N/10 \rceil - 1$, where $N$ represents the signal range. While the results accurately identify all spectral peaks of
interest, they may also contain spurious peaks. For instance, during beam pauses or when there is no beam
flow, the Faraday cup intensity should ideally be 0 eμA. However, due to
disturbances from the device itself, the monitored Faraday cup intensity at this time may fluctuate slightly above and
below 0 eμA. Nonetheless, these false peaks are far less significant than the spectral peaks formed by the beam
current, which are the peaks of interest. As the peak positions and peak heights in the spectrograms of different types of beams
may vary considerably, it is not feasible to establish a general significance threshold. To avoid introducing
additional hyperparameters into the algorithm, the original AMPD algorithm's peak output is improved in this study
by specifying that only peaks whose height exceeds 2% of the highest peak
value in the spectrogram are considered beam spectrum peaks. This empirical threshold helps filter out false peaks and retain only the
peaks most likely to represent actual spectral peaks of interest. The process of the algorithm at this stage is as follows:
first, the local maxima scalogram (LMS) is calculated based on parameter $L$. Second, the row-wise sum of the LMS
is computed, and the LMS is reconstructed based on the minimum value of the row sum. Third, the peak-seeking
output of the AMPD algorithm is obtained. Finally, a new peak sequence is output after filtering the
peaks according to the empirical threshold. The calculation process is illustrated in Fig. 4:
Fig.4 Calculation steps of the improved AMPD algorithm
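
The four steps above can be sketched compactly with NumPy. This is a simplified illustration, not the authors' code: the LMS is stored as a boolean matrix (so the scale with the most local maxima plays the role of the minimum row sum in the original formulation), the linear detrending step of the original AMPD is omitted, and the names `improved_ampd`, `rel_threshold`, and `scale_divisor` are hypothetical.

```python
import numpy as np


def improved_ampd(x, rel_threshold=0.02, scale_divisor=10):
    """AMPD-style peak search followed by a relative-height filter."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_scales = int(np.ceil(n / scale_divisor)) - 1    # L = ceil(N / 10) - 1
    if n_scales < 1:
        return np.array([], dtype=int)

    # Step 1: local maxima scalogram; row k-1 marks samples that exceed
    # both neighbours at distance k.
    lms = np.zeros((n_scales, n), dtype=bool)
    for k in range(1, n_scales + 1):
        i = np.arange(k, n - k)
        lms[k - 1, i] = (x[i] > x[i - k]) & (x[i] > x[i + k])

    # Step 2: pick the scale with the most local maxima and keep rows up to it.
    best = int(np.argmax(lms.sum(axis=1)))

    # Step 3: peaks are samples that are local maxima at every retained scale.
    peaks = np.flatnonzero(lms[: best + 1].all(axis=0))

    # Step 4: keep only peaks higher than 2% of the highest value.
    return peaks[x[peaks] > rel_threshold * x.max()]
```

On a synthetic spectrum with peaks of height 10, 5, and 0.1, the first three steps return all three maxima and the last step drops the 0.1 peak, because it lies below 2% of the highest peak.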


2.3 Design of the Greedy algorithm based on the relative distance 


In this study, a greedy matching algorithm based on the relative distance was developed to address the issue of 
matching peaks to charge states. This algorithm follows a greedy strategy compatible with the physical properties of 
the ion source. The difference between each peak detected in the previous section and the calculated theoretical value
(according to Eq. (8)) is determined. The matches that meet the requirements for each peak are identified by
screening according to the greedy strategy, and greedy matching is completed based on this screening process.
Because the greedy algorithm only considers locally optimal solutions and cannot
backtrack, the concept of relative distance is introduced in this study. As the change in the charge state of each ion
in the same spectrogram is usually continuous, the first recognition result of the greedy algorithm is classified
according to the ion species, and the coded charge states of the classified ions are obtained. Finally, the first
recognition result is further refined using interquartile range (IQR) anomaly detection to reject outliers and obtain
the final recognition result. This step helps improve the accuracy and reliability of the charge-state identification 
process. The specific process of the algorithm used in this study is as follows: 

The standard peak $C$ of each charge state is first calculated based on the empirical formula of the ion
source within a specified sweep range, using Eq. (8). In this equation, $\mu$ represents the constant of the dipole magnet
resolution system of the ECR ion source, $U$ denotes the extraction high voltage, and $M/q$ represents the ion
mass-to-charge ratio.

$$C = \mu\sqrt{U\frac{M}{q}} \qquad (8)$$

The set of peaks computed in the previous section is denoted as $P = \{P_1, P_2, \ldots, P_m\}$, and the set of charge-state standard
peaks is denoted by $C = \{C_1, C_2, \ldots, C_n\}$. These two sets are encoded separately. The next step is to devise a greedy
strategy and generate a set of solutions that adhere to it. In this paper, the chosen greedy strategy
is based on the resolution of the dipole magnet against the spectral peaks, and it is defined as follows, where $s$
denotes the minimum distance index:

$$|P_i - C_s| < 0.1 \qquad (9)$$


Once an appropriate greedy strategy is determined, the next step is to calculate the relative distance between each
peak and each theoretical value. The results are organized into a relative distance matrix, denoted as
$L$ (Eq. (10)), where $L_{(i,j)} = |P_i - C_j|$.

$$L = \begin{bmatrix} L_{(1,1)} & L_{(1,2)} & \cdots & L_{(1,n)} \\ L_{(2,1)} & L_{(2,2)} & \cdots & L_{(2,n)} \\ \vdots & \vdots & \ddots & \vdots \\ L_{(m,1)} & L_{(m,2)} & \cdots & L_{(m,n)} \end{bmatrix} \qquad (10)$$

For the $i$-th peak, the minimum distance index $s$ is selected based on $L$, where
$L_{(i,s)} = \min(L_{(i,1)}, L_{(i,2)}, \ldots, L_{(i,n)})$. Then, the set of solutions $R_{raw} = \{R_1, R_2, \ldots, R_m\}$ is expanded according to the
greedy strategy, where each $R_i$ is the match determined by $L_{(i,s)}$.
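
As a rough illustration of the matching step, the sketch below computes theoretical peak positions from the square-root dependence on $UM/q$, builds the distance matrix $|P_i - C_j|$, and accepts the nearest theoretical peak only when it lies within 0.1 A. The function names and the numerical values used in the usage note ($\mu = 1$, $U = 100$, $M = 40$) are purely illustrative assumptions, not parameters of a real ion source.

```python
import numpy as np


def theoretical_peaks(mu, voltage, mass, charges):
    """Dipole current expected for each charge state: C = mu * sqrt(U * M / q)."""
    charges = np.asarray(charges, dtype=float)
    return mu * np.sqrt(voltage * mass / charges)


def greedy_match(peaks, theory, threshold=0.1):
    """Map each peak index i to the index s of its nearest theoretical peak."""
    peaks = np.asarray(peaks, dtype=float)
    theory = np.asarray(theory, dtype=float)
    dist = np.abs(peaks[:, None] - theory[None, :])   # L[i, j] = |P_i - C_j|
    nearest = dist.argmin(axis=1)                      # minimum distance index s
    matches = {}
    for i, s in enumerate(nearest):
        if dist[i, s] < threshold:                     # greedy acceptance rule
            matches[i] = int(s)
    return matches
```

With `theory = theoretical_peaks(1.0, 100.0, 40.0, [8, 9, 10])` and measured peaks `[20.03, 21.05, 25.0]`, the first two peaks are matched to charge states 10 and 9, while the peak at 25.0 A stays unmatched because no theoretical value lies within the threshold.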


After obtaining the set of raw solutions, the next step in the algorithm is to remove outliers from this set. First, the
original codes are classified according to the ion species and then normalized using the robust_stable method. This
method removes the scale of the data and makes them comparable [38]. This helps to account for any
variations in magnitude or scale among the different ion species, allowing for more robust outlier detection. Finally,
the outliers are rejected after anomaly detection using the IQR, as described in Eq. (11). The IQR is a measure of
the dispersion or spread of data and is calculated as the difference between the 75th ($Q_3$) and 25th ($Q_1$) percentiles
of the data [39]. This outlier rejection step helps remove any spurious or erroneous solutions from the data, ensuring
that only reliable and accurate solutions are considered in the subsequent analysis or processing steps of the algorithm.

$$\begin{aligned} IQR &= Q_3 - Q_1 \\ Q_{min} &= Q_1 - k \cdot IQR \\ Q_{max} &= Q_3 + k \cdot IQR \end{aligned} \qquad (11)$$

In Eq. (11), $Q_{max}$ represents the upper bound and $Q_{min}$ represents the lower bound. When an observation $Q$ does not
satisfy $Q_{min} < Q < Q_{max}$, it is considered an outlier and must subsequently be removed from the solution set [40]. The
solutions that remain form the final solution set. Fig. 5 illustrates the flow schematic of the greedy matching
algorithm based on relative distance. The detailed procedure for this algorithm is presented in Algorithm 1. This
algorithm utilizes the concept of relative distance and incorporates the IQR method for outlier rejection to improve
the accuracy and reliability of the matching process for identifying the charge states [41] of ions in the spectrogram.
Fig.5 The flow diagram of the greedy matching algorithm based on relative distance
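
The outlier rejection of Eq. (11) can be sketched as below. Several points are assumptions of this sketch rather than statements about the original system: the "robust_stable" normalization is interpreted as median and IQR scaling, the multiplier `k = 1.5` is the conventional default since no value is given in the text, and the bounds are treated as inclusive so that a constant series is not rejected entirely.

```python
import numpy as np


def robust_scale(values):
    """Centre on the median and divide by the IQR (assumed normalization)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    centred = values - np.median(values)
    return centred / iqr if iqr > 0 else centred


def iqr_inliers(values, k=1.5):
    """Boolean mask of observations inside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    q_min = q1 - k * iqr
    q_max = q3 + k * iqr
    return (values >= q_min) & (values <= q_max)
```

For the charge-state codes `[8, 9, 10, 11, 12, 30]` of one ion species, the mask keeps the consecutive codes 8 to 12 and rejects 30. Applying `robust_scale` first leaves the mask unchanged, which is expected because the IQR test is invariant under shifting and positive scaling.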


Algorithm 1 Greedy algorithm based on relative distance
Input: Collection of computed values A, collection of theoretical values B,
       all encoding results OLDInd, threshold Δ
Output: Corrected match result NEWInd
Note: When the difference between the mass-to-charge ratios of two particles
      is less than the threshold Δ, multiple elements in the spectrum are
      considered to belong to the same peak and therefore to the same match.
i = getColumNum(A)
j = getColumNum(B)
for A[i] in A do
    for B[j] in B do
        d[i][j] = abs(A[i] - B[j])
        Distance = d[i][j]
        if Distance < Δ then
            num[i] = j
            Ind = num
            Index = OLDInd[i][Ind]
        else
            continue
        end if
    end for
end for
while isNotEmpty(Index[i]) do
    n = len(Index)
    Q1 = (n + 1) / 4;  Q3 = 3(n + 1) / 4
    IQR = Q3 - Q1
    initialize k                                      ▷ k is the set anomaly index
    if Q1 - k·IQR < Index[i][j] < Q3 + k·IQR then     ▷ Keep inliers; outliers are removed
        NEWInd[i] ← Index[i][j]
    end if
end while
return NEWInd


3 Automatic Spectrum Recognition System Performance Evaluation and Operational Testing


3.1 Performance analysis of SG filtering algorithm with adaptive window length 


The present study uses actual Ar and Xe beam data to evaluate the proposed algorithm. The results of applying
the SG filter algorithm with an adaptive window length to the charge-state distribution spectra of the Ar and Xe
beams are presented in Fig. 6a and 6b, respectively. The original signal is depicted in pink, and the output of the SG
filter is shown in red. The figures demonstrate that the proposed SG filtering algorithm with an adaptive window
length effectively smooths the signal curve while preserving the original signal characteristics. The adaptive
algorithm calculates the optimal window length for each position in the spectrum based on the signal features, in
contrast to the original SG filtering algorithm, which employs a fixed window length. This approach enables filtering
to preserve more original features while effectively decreasing noise and avoiding signal distortion. The results of
this study suggest that the proposed algorithm is a promising tool for effectively filtering and analyzing complex
beam spectra.
Fig.6 (a) Original Ar beam signal and SG filter with adaptive window length output. (b) Original Xe beam signal
and SG filter with adaptive window length output. 

In a previous study [42], the standard deviation of the first-order derivative was introduced as a metric to assess the
effect of data smoothing. Specifically, a smaller standard deviation of the first-order derivative indicates that the signal
has been smoothed more effectively. To compare the effectiveness of the proposed SG filtering algorithm with adaptive
window length against the original SG filtering algorithm, we examined the trend of the first-order derivatives in the 
original signal, the output of the ordinary SG filtering algorithm, and the output of the proposed algorithm. Fig. 7a 
and Fig. 7b depict the first-order derivative variations for the Ar and Xe beams, respectively. The blue line represents 
the first-order derivative trend of the original signal with respect to the number of data points. The red line represents 
the trend of the first-order derivative of the signal filtered using the original SG filtering algorithm, whereas the 
green line represents the trend using the proposed algorithm. To ensure a fair comparison, we used the same 
parameter values for the original SG filter as those initially used in the proposed algorithm, namely, n=2 and N=5. 
The results demonstrate that the proposed algorithm outperforms the original SG filtering algorithm by producing a 
smoother first-order derivative trend, indicating better noise reduction and signal preservation. 
Fig.7 This study proposes an adaptive window length Savitzky-Golay (SG) filter algorithm and compares its
performance with that of the original SG filter algorithm and the original signal. The trend of the first derivative of 
the output results was analyzed for both Ar and Xe beams. Specifically, we examine the variation in the first 
derivative with the number of data points for each algorithm. (a) shows the variation of the first derivative of the Ar 
beam with respect to the data points. (b) shows the same for the Xe beam. 

The standard deviations of the first-order derivatives of the three signals were computed to compare the performance
of the proposed SG filtering algorithm with the adaptive window length against that of the original SG filtering
algorithm and the original signal. Table 1 presents the results. The standard
deviation of the first-order derivative decreases as the signal is filtered, indicating that noise is effectively reduced.

Table 1 The standard deviation of the first derivative of the signal for the three cases.

Beam   The original signal   The ordinary SG filter algorithm   The SG filter algorithm with adaptive window length
Ar     91.4372 A             90.5672 A                          82.1846 A
Xe     152.5697 A            151.7765 A                         147.0195 A


Specifically, the proposed algorithm achieved standard deviations of 82.1846 A and 147.0195 A for the Ar and Xe
beam spectra, respectively, whereas the original SG filtering algorithm yielded standard deviations of 90.5672 A and
151.7765 A for the same spectra. The proposed algorithm yields a smaller standard deviation than the
original SG filtering algorithm, indicating a smoother output. These findings support the claim that the
proposed algorithm offers a better denoising and smoothing effect on the beam spectra than the original methods
evaluated in this study.
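
The smoothness metric used in Table 1 can be sketched in a few lines. The sketch assumes that the first-order derivative is taken as the difference between consecutive data points, in line with the description of the derivative as a function of the data point index; the helper name `derivative_std` is hypothetical.

```python
import numpy as np


def derivative_std(signal):
    """Standard deviation of the first-order difference of a sampled signal."""
    signal = np.asarray(signal, dtype=float)
    return float(np.std(np.diff(signal)))
```

A perfectly linear ramp gives a value of zero, while a signal that alternates between two levels gives a large value, so a lower score corresponds to a smoother curve. The metric does not by itself show that peak shapes are preserved, which is why it is read together with the spectra in Fig. 6.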


3.2 Performance analysis of improved AMPD algorithm 


This study aims to evaluate the peak-seeking performance of the improved automatic multiscale peak detection 
(AMPD) algorithm by analyzing two beam spectra, as illustrated in Fig. 8. The Ar beam spectral peak detection is 
presented in Fig. 8a and Fig. 8b, whereas the Xe beam results are displayed in Fig. 8c and Fig. 8d. Owing to the 
denser charge-state distribution in the Xe beam spectrum, more peaks appear within a specific dipole magnet current 
variation range, leading to considerably greater detection difficulties than those encountered in the Ar beam spectrum. 
Nonetheless, the detection results of the improved AMPD algorithm agree with those obtained from physical
experience, indicating its relative robustness to different beam spectra without the need for additional
hyperparameters owing to the empirical threshold employed for peak filtering. Furthermore, the improved
AMPD algorithm exhibited excellent performance in detecting the spectral peaks for high-charge states and dense 
distributions. The findings of this study suggest that the improved AMPD algorithm is a promising tool for accurate 
and efficient peak detection in complex spectra, particularly in scenarios involving highly charged states and dense
distributions.
Fig.8 Comparison of the peak detection results of the improved AMPD algorithm proposed in this study with those of
the original AMPD algorithm. (a) and (c) show the peak detection results of the original AMPD algorithm for the Ar and
Xe beams, respectively. In contrast, (b) and (d) show the proposed improved AMPD algorithm results for the Ar and
Xe beams, respectively.


3.3 Performance analysis of greedy matching algorithm based on relative distance 


The original greedy algorithm assembles its overall solution from locally optimal choices, so the result is not
guaranteed to be globally optimal. An IQR-based anomaly detection mechanism using distance encoding was proposed to
address this issue. First, the distance-encoding output of the initial greedy algorithm was grouped by ion species.
Next, IQR-based anomaly detection was performed on each group, and the anomalous values were visualized using box
plots. The anomaly detection results for the Ar and Xe spectra are shown in Fig. 9. Specifically, Fig. 9a
illustrates the results for the Ar beam spectrum, in which only one anomaly appeared in the identification results
of the main gas. In contrast, Fig. 9b shows the results for the Xe beam, in which two anomalous values were found in
the identification results of the main gas.

Fig. 9 Output of the interquartile range (IQR) anomaly detection method for the Ar and Xe beams. The red dots
represent the anomalous values, the red line segment denotes the median, and the blue dashed line indicates the
arithmetic mean of each category. (a) Box plot of the Ar beam; (b) box plot of the Xe beam.
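
The IQR screening described above can be illustrated with a minimal sketch. The 1.5 x IQR fence, the function names,
and the distance values below are assumptions chosen for illustration; the fence multiplier actually used is not
stated here, and the sample distance codes are not measured data.

```python
import numpy as np


def iqr_outliers(values, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional box-plot fence and is an assumption here.
    """
    v = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(v, [25, 75])
    iqr = q3 - q1
    return (v < q1 - k * iqr) | (v > q3 + k * iqr)


def flag_by_species(matches, k=1.5):
    """Group (species, distance) pairs by species and flag outliers per group.

    Returns the sorted indices (into `matches`) of the flagged entries.
    """
    grouped = {}
    for idx, (species, dist) in enumerate(matches):
        grouped.setdefault(species, []).append((idx, dist))

    flagged = []
    for items in grouped.values():
        idxs, dists = zip(*items)
        mask = iqr_outliers(dists, k)
        flagged.extend(i for i, bad in zip(idxs, mask) if bad)
    return sorted(flagged)


# Hypothetical distance codes from an initial greedy match (not measured data).
matches = [
    ("Ar", 0.010), ("Ar", 0.012), ("Ar", 0.011), ("Ar", 0.009),
    ("Ar", 0.013), ("Ar", 0.250),
    ("O", 0.004), ("O", 0.005), ("O", 0.006),
]
print(flag_by_species(matches))
```

Screening each ion species separately keeps a tight group, such as an impurity with few peaks, from being judged
against the spread of the main gas.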

A secondary judgment was made after removing outliers to ensure the accuracy of the final identification results.
Specifically, in the same spectrum with the same ion mass, the dipole magnet currents and charge states of two
adjacent peak positions must satisfy the empirical formula expressed in Eq. (12):

$$\frac{I_{q-1}}{I_q} = \sqrt{\frac{q}{q-1}} \tag{12}$$

where $I_q$ denotes the dipole magnet current at the peak of charge state $q$. The actual value of $I_{q-1}/I_q$ was
compared with the theoretical value of $\sqrt{q/(q-1)}$. The accuracy of each matching term for each ion species was
separately calculated as the difference between the actual and theoretical values. The root mean square error (RMSE)
was subsequently computed and averaged for each ionic species. A lower RMSE implies a higher accuracy of the
algorithm. The formula for calculating the RMSE is shown in Eq. (13) [43].

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - f(x_i)\right)^2} \tag{13}$$

The RMSE values of the matching results for the Ar and Xe beams are presented in Table 2. In Eq. (13), $Y_i$
represents the actual value, $f(x_i)$ represents the theoretical value, and $N$ is the number of matching terms. The
algorithm is considered acceptable when the RMSE < 0.1 A and excellent when the RMSE < 0.05 A [44, 45].
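
As a rough illustration of Eqs. (12) and (13), the following sketch checks adjacent charge-state peaks of one ion
species against Eq. (12) and computes the RMSE of the residuals. The peak currents and function names are
hypothetical. Expressing each residual in amperes, as the measured I_{q-1} minus I_q * sqrt(q/(q-1)), is an
assumption made so that the result can be compared with the 0.1 A and 0.05 A thresholds.

```python
import math


def theoretical_ratio(q):
    """Eq. (12): expected I_{q-1} / I_q for adjacent charge states q-1 and q."""
    if q < 2:
        raise ValueError("q must be at least 2 so that q-1 is a valid charge state")
    return math.sqrt(q / (q - 1))


def adjacent_pairs(peaks):
    """Build paired lists of measured and predicted I_{q-1} values.

    peaks maps charge state -> dipole magnet current (A) for one ion species.
    The prediction is Eq. (12) rearranged: I_{q-1} = I_q * sqrt(q / (q - 1)).
    """
    actual, theoretical = [], []
    for q in sorted(peaks):
        if q - 1 in peaks:
            actual.append(peaks[q - 1])
            theoretical.append(peaks[q] * theoretical_ratio(q))
    return actual, theoretical


def rmse(actual, theoretical):
    """Eq. (13): root mean square error between paired values."""
    if not actual or len(actual) != len(theoretical):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((y - f) ** 2 for y, f in zip(actual, theoretical))
    return math.sqrt(squared / len(actual))


def grade(value, acceptable=0.1, excellent=0.05):
    """Classify an RMSE in amperes using the thresholds quoted in the text."""
    if value < excellent:
        return "excellent"
    if value < acceptable:
        return "acceptable"
    return "unacceptable"


# Hypothetical matched peaks for one species: charge state -> current (A).
peaks = {8: 106.10, 9: 100.00, 10: 94.85, 11: 90.45}
actual, theoretical = adjacent_pairs(peaks)
error = rmse(actual, theoretical)
print(f"RMSE = {error:.4f} A ({grade(error)})")
```
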


Table 2 Root mean square error of the matching results for the Ar and Xe beams

Class        GasMain     C           N           O
RMSE (Ar)    0.0678 A    0.0034 A    0.0028 A    0.0024 A
RMSE (Xe)    0.0880 A    —           —           0.0115 A

According to the data in Table 2, all the results fall within the acceptable range (RMSE < 0.1 A), and all
categories except GasMain fall within the excellent range (RMSE < 0.05 A). This suggests that the proposed algorithm
satisfies the accuracy requirements for beam spectrogram identification and has practical value in engineering
applications.


3.4 Beam charge state spectrogram identification interface test 


A software interface integrating data acquisition, peak finding, and spectrum recognition was developed. The
interface, shown in Fig. 10, allows users to specify the start and end of the scan range and the scan step. Before
peak seeking, users must specify the mass number of the main gas and the extraction high voltage of the ion source.

Fig. 10 Using simple button clicks, an operator can easily set the beam parameters on the user interface and
perform spectrum scanning, peak identification, and automatic spectrum recognition. (a) illustrates the primary
interface for Ar beam spectrum identification, and (b) shows the primary interface for Xe beam spectrum
identification.
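
The role of the main-gas mass number can be illustrated with a short sketch. It assumes that, at a fixed extraction
voltage, the dipole magnet current at a peak scales as sqrt(A/q) for mass number A and charge state q (the scaling
underlying Eq. (12)) and that the magnet responds linearly. The function names and the chosen reference peak are
hypothetical, and the sketch shows only the scaling, not the matching algorithm itself.

```python
import math

# Mass numbers used as approximate ion masses (an assumption of this sketch).
MASS_NUMBER = {"C": 12, "N": 14, "O": 16, "Ar": 40}


def relative_position(mass, charge, ref_mass, ref_charge):
    """Peak current of (mass, charge) divided by that of the reference peak.

    Assumes current is proportional to sqrt(mass / charge) at a fixed
    extraction voltage, so the voltage cancels in the ratio.
    """
    if charge < 1 or ref_charge < 1:
        raise ValueError("charge states must be positive integers")
    return math.sqrt((mass / charge) / (ref_mass / ref_charge))


def expected_positions(ref, candidates):
    """Map each (species, charge) candidate to its position relative to `ref`."""
    ref_species, ref_charge = ref
    ref_mass = MASS_NUMBER[ref_species]
    return {
        (species, charge): relative_position(
            MASS_NUMBER[species], charge, ref_mass, ref_charge
        )
        for species, charge in candidates
    }


# Hypothetical reference: the Ar 8+ peak of an Ar main-gas spectrum.
positions = expected_positions(
    ("Ar", 8), [("Ar", 9), ("O", 3), ("N", 3), ("C", 2)]
)
for (species, charge), rel in sorted(positions.items(), key=lambda kv: kv[1]):
    print(f"{species}{charge}+: {rel:.4f} x reference current")
```

Under this assumption, two ions with the same mass-to-charge ratio would share one position, which is one reason a
single peak may admit more than one candidate assignment.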

Clicking the "Identify spectrum" button in the main interface opens the spectrum identification interface, which
displays the ion species and charge state corresponding to each peak, as shown in Fig. 11. Users can zoom in on the
interface to view the details. The identification results obtained using the system proposed in this study meet the
accuracy requirements for identifying the Ar (Fig. 11a) and Xe (Fig. 11b) beams.

Fig. 11 Performance evaluation of the beam charge-state distribution spectrogram identification system for the Ar
and Xe beams. (a) Spectrum recognition results for the Ar beam and (b) spectrum recognition results for the Xe beam.


4 Conclusion 


In this study, a novel spectral recognition system for beam charge-state distribution of high-charge-state Electron 
Cyclotron Resonance (ECR) ion sources was designed and developed. It enables automatic spectrum recognition 
using three algorithms: Savitzky-Golay (SG) filtering with adaptive window length, improved automatic multiscale 
peak detection (AMPD), and greedy matching based on relative distance. 

The system achieved optimized smoothing with the adaptive window-length SG filtering algorithm introduced in this
study and accurate peak detection with the improved AMPD algorithm. The proposed greedy matching algorithm
effectively identified spectral peaks and their ion charge states. The system offers a user-friendly software
interface and a real-time display of the results. It effectively and accurately identified the beam spectra of Xe
and of species with mass numbers below that of Xe.

The advantages of the developed automatic spectrum recognition system are as follows: 

1. The SG filtering algorithm with adaptive window length improves smoothing performance.

2. The improved AMPD algorithm eliminates the need for pre-setting hyperparameters and effectively rejects false 
peaks. 

3. The greedy matching algorithm with relative distance, augmented by the IQR anomaly-detection mechanism, 
ensures accurate and efficient matching, thereby overcoming the limitations of the greedy algorithm. 

4. The user-friendly software interface enables easy parameter specification and real-time display of spectral 
identification results. 

This automatic spectrum recognition system improves the commissioning efficiency of the ECR ion source and
achieves precise charge-state ion beam injection, making it a valuable contribution to high-charge-state ECR ion
source research.